How (Not) To Train Your Neural Network Using the Information Bottleneck Principle

Authors

  • Rana Ali Amjad
  • Bernhard C. Geiger
Abstract

In this theory paper, we investigate training deep neural networks (DNNs) for classification via minimizing the information bottleneck (IB) functional. We show that, even if the joint distribution between continuous feature variables and the discrete class variable is known, the resulting optimization problem suffers from two severe issues: First, for deterministic DNNs, the IB functional is infinite for almost all weight matrices, making the optimization problem ill-posed. Second, the invariance of the IB functional under bijections prevents it from capturing desirable properties for classification, such as robustness, architectural simplicity, and simplicity of the learned representation. We argue that these issues are partly resolved for stochastic DNNs, DNNs that include a (hard or soft) decision rule, or by replacing the IB functional with related, but more well-behaved cost functions. We conclude that recent successes reported about training DNNs using the IB framework must be attributed to such solutions. As a side effect, our results imply limitations of the IB framework for the analysis of DNNs.
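The invariance under bijections mentioned in the abstract can be illustrated with a small discrete toy example. The sketch below is not taken from the paper: it assumes the commonly used form of the IB functional, I(X;L) - beta * I(Y;L), for features X, class Y and representation L, and it uses finite alphabets for simplicity, whereas the paper's setting has continuous features. The function names, the toy distribution and the value of `beta` are all hypothetical.

```python
import numpy as np


def mutual_information(p_joint):
    """Mutual information (in nats) of a discrete joint pmf given as a 2-D array."""
    p_joint = np.asarray(p_joint, dtype=float)
    p_a = p_joint.sum(axis=1, keepdims=True)  # marginal of the row variable
    p_b = p_joint.sum(axis=0, keepdims=True)  # marginal of the column variable
    mask = p_joint > 0                        # skip zero-probability cells
    ratio = p_joint[mask] / (p_a @ p_b)[mask]
    return float(np.sum(p_joint[mask] * np.log(ratio)))


def ib_functional(p_xy, p_l_given_x, beta):
    """Assumed IB functional I(X;L) - beta * I(Y;L) for the Markov chain Y - X - L."""
    p_xy = np.asarray(p_xy, dtype=float)
    p_l_given_x = np.asarray(p_l_given_x, dtype=float)
    p_x = p_xy.sum(axis=1)
    p_xl = p_x[:, None] * p_l_given_x  # joint pmf of (X, L)
    p_yl = p_xy.T @ p_l_given_x        # joint pmf of (Y, L)
    return mutual_information(p_xl) - beta * mutual_information(p_yl)


# Hypothetical toy problem: 3 feature values, 2 classes, 3 representation values.
p_xy = np.array([[0.30, 0.10],
                 [0.10, 0.20],
                 [0.05, 0.25]])
encoder = np.array([[0.9, 0.1, 0.0],
                    [0.2, 0.7, 0.1],
                    [0.0, 0.3, 0.7]])  # rows are p(l | x)

# Relabel the representation values with a bijection (a permutation).
relabeled_encoder = encoder[:, [2, 0, 1]]

value_original = ib_functional(p_xy, encoder, beta=2.0)
value_relabeled = ib_functional(p_xy, relabeled_encoder, beta=2.0)
print(np.isclose(value_original, value_relabeled))
```

Under these assumptions the two values agree, because mutual information does not depend on how the values of L are labeled. This is the sense in which the functional cannot distinguish between representations that differ only by a bijection.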

Similar resources

Diagnosis of hyperlipidemia in patients based on an artificial neural network with pso algorithm

One of the most common and most dangerous diseases of blood fats are such as heart disease, diabetes and stroke, heart and brain. It can control the timely diagnosis, treatment and then prevention of complications is become very effective even without using medicine. Heart disease and diabetes file if patients has useful information that can be used to estimate blood fat timely diagnosis. In th...

An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition

Convolutional Neural Networks (CNNs) have shown their performance in speech recognition systems for feature extraction and acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. A Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...

Application of statistical techniques and artificial neural network to estimate force from sEMG signals

This paper presents an application of design of experiments techniques to determine the optimized parameters of artificial neural network (ANN), which are used to estimate force from Electromyogram (sEMG) signals. The accuracy of ANN model is highly dependent on the network parameters settings. There are plenty of algorithms that are used to obtain the optimal ANN setting. However, to the best ...

Using Artificial Neural Network Algorithm to Predict Tensile Properties of Cotton-Covered Nylon Core Yarns

Artificial Neural Networks are information processing systems. Over the past several years, these algorithms have received much attention for their applications in pattern completing, pattern matching and classification and also for their use as a tool in various areas of problem solving. In this work, an Artificial Neural Network model is presented for predicting the tensile properties of ...

Journal:
  • CoRR

Volume: abs/1802.09766

Publication date: 2018